Contextual Out-of-Domain Utterance Handling With Counterfeit Data Augmentation
Neural dialog models often lack robustness to anomalous user input and
produce inappropriate responses, which leads to a frustrating user experience.
Although several prior approaches to out-of-domain (OOD) utterance
detection exist, they share a few restrictions: they rely on OOD data or multiple
sub-domains, and their OOD detection is context-independent, which leads to
suboptimal performance in a dialog. The goal of this paper is to propose a
novel OOD detection method that does not require OOD data by utilizing
counterfeit OOD turns in the context of a dialog. To foster further
research, we also release new dialog datasets: three publicly
available dialog corpora augmented with OOD turns in a controllable way. Our
method outperforms state-of-the-art dialog models equipped with a conventional
OOD detection mechanism by a large margin in the presence of OOD utterances.

Comment: ICASSP 201
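The counterfeit-OOD idea can be sketched in a few lines. The function below is a minimal illustration, not the paper's exact procedure: it assumes each dialog is a list of user utterances and inserts turns borrowed from other dialogs as pseudo-OOD examples, so no real OOD data is needed.

```python
import random

def counterfeit_ood_augment(dialogs, rate=0.3, seed=0):
    """Insert counterfeit OOD turns into in-domain dialogs (sketch).

    After every turn, with probability `rate`, an utterance borrowed
    from a *different* dialog is inserted and labelled OOD (1);
    original turns keep the in-domain label 0.
    """
    rng = random.Random(seed)
    augmented = []
    for i, dialog in enumerate(dialogs):
        # pool of utterances from all other dialogs, used as counterfeits
        foreign_pool = [u for j, d in enumerate(dialogs) if j != i for u in d]
        turns = []
        for utt in dialog:
            turns.append((utt, 0))
            if foreign_pool and rng.random() < rate:
                turns.append((rng.choice(foreign_pool), 1))
        augmented.append(turns)
    return augmented
```

A context-aware OOD detector can then be trained on these labelled sequences without ever seeing genuine OOD data.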
Challenging Neural Dialogue Models with Natural Data: Memory Networks Fail on Incremental Phenomena
Natural, spontaneous dialogue proceeds incrementally, word by word,
and contains many sorts of disfluency, such as mid-utterance
hesitations, interruptions, and self-corrections. But training data for machine
learning approaches to dialogue processing is often either cleaned-up or wholly
synthetic in order to avoid such phenomena. The question then arises of how
well systems trained on such clean data generalise to real spontaneous
dialogue, or indeed whether they are trainable at all on naturally occurring
dialogue data. To answer this question, we created a new corpus called bAbI+ by
systematically adding natural spontaneous incremental dialogue phenomena such
as restarts and self-corrections to Facebook AI Research's bAbI dialogue
dataset. We then explore the performance of a state-of-the-art retrieval model,
MemN2N, on this more natural dataset. Results show that the semantic accuracy
of the MemN2N model drops drastically; and that although it is in principle
able to learn to process the constructions in bAbI+, it needs an impractical
amount of training data to do so. Finally, we go on to show that an
incremental semantic parser -- DyLan -- achieves 100% semantic accuracy on both
bAbI and bAbI+, highlighting the generalisation properties of linguistically
informed dialogue models.

Comment: 9 pages, 3 figures, 2 tables. Accepted as a full paper for SemDial
201
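A bAbI+-style transformation can be sketched as follows. This is a toy illustration under our own assumptions: the edit templates ("uhm", "no sorry") stand in for whatever controlled transformations the corpus actually uses, and only one phenomenon is injected per utterance.

```python
import random

def add_incremental_phenomena(utterance, rng):
    """Inject one bAbI+-style phenomenon into an utterance (sketch):
    a hesitation, a restart, or a self-correction."""
    words = utterance.split()
    i = rng.randrange(len(words))
    kind = rng.choice(["hesitation", "restart", "correction"])
    if kind == "hesitation":
        # "book a uhm table for two"
        return " ".join(words[:i] + ["uhm"] + words[i:])
    if kind == "restart":
        # repeat a prefix, then restart from the beginning
        return " ".join(words[: i + 1] + words)
    # self-correction: say word i, signal an edit, continue from word i
    return " ".join(words[: i + 1] + ["uhm", "no", "sorry"] + words[i:])
```

Applying such edits systematically to a clean corpus makes it possible to measure exactly how much each phenomenon degrades a model's semantic accuracy.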
Neural Response Ranking for Social Conversation: A Data-Efficient Approach
The overall objective of 'social' dialogue systems is to support engaging,
entertaining, and lengthy conversations on a wide variety of topics, including
social chit-chat. Apart from raw dialogue data, user-provided ratings are the
most common signal used to train such systems to produce engaging responses. In
this paper we show that social dialogue systems can be trained effectively from
raw unannotated data. Using a dataset of real conversations collected in the
2017 Alexa Prize challenge, we developed a neural ranker for selecting 'good'
system responses to user utterances, i.e. responses which are likely to lead to
long and engaging conversations. We show that (1) our neural ranker
consistently outperforms several strong baselines when trained to optimise for
user ratings; (2) when trained on larger amounts of data and only using
conversation length as the objective, the ranker performs better than the one
trained using ratings -- ultimately reaching a Precision@1 of 0.87. This
advance will make data collection for social conversational agents simpler and
less expensive in the future.

Comment: 2018 EMNLP Workshop SCAI: The 2nd International Workshop on
Search-Oriented Conversational AI. Brussels, Belgium, October 31, 201
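The length-based ranking objective reduces to a simple selection rule. The sketch below assumes an arbitrary scorer callable in place of the paper's trained neural ranker; the interface is ours, for illustration only.

```python
def rank_responses(context, candidates, length_model):
    """Rank candidate responses by predicted conversation length (sketch).

    `length_model(context, candidate)` returns the predicted remaining
    dialogue length if this response is given; the top-ranked candidate
    is returned first.  Precision@1 then measures how often that
    top candidate is the 'good' one.
    """
    scored = sorted(candidates, key=lambda c: length_model(context, c), reverse=True)
    return scored[0], scored
```

With a toy scorer that simply prefers longer replies, `rank_responses("hi", ["ok", "tell me more about that topic"], lambda ctx, c: len(c.split()))` returns the more engaging candidate first; the actual system replaces the toy scorer with a neural model trained on conversation-length supervision.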
Data-efficient methods for dialogue systems
Conversational User Interfaces (CUIs) have become ubiquitous in everyday life, from consumer-focused products like Siri and Alexa to more business-oriented customer support automation
solutions. Deep learning underlies many recent breakthroughs in dialogue systems but requires
very large amounts of training data, often annotated by experts — and this dramatically increases the cost of deploying such systems in production setups and reduces their flexibility as
software products. Trained with smaller data, these methods end up severely lacking robustness
to various phenomena of spoken language (e.g. disfluencies), out-of-domain input, and often
just have too little generalisation power to other tasks and domains.
In this thesis, we address the above issues by introducing a series of methods for bootstrapping
robust dialogue systems from minimal data. Firstly, we study two orthogonal approaches to dialogue: a linguistically informed model (DyLan) and a machine learning-based one (MemN2N) —
from the data efficiency perspective, i.e. their potential to generalise from minimal data and
robustness to natural spontaneous input. We outline the steps to obtain data-efficient solutions
with either approach and proceed with the neural models for the rest of the thesis.
We then introduce the core contributions of this thesis, two data-efficient models for dialogue
response generation: the Dialogue Knowledge Transfer Network (DiKTNet) based on transferable latent dialogue representations, and the Generative-Retrieval Transformer (GRTr) combining response generation logic with a retrieval mechanism as the fallback. GRTr ranked first at
the Dialog System Technology Challenge 8 Fast Domain Adaptation task.
Next, we address the problem of training robust neural models from minimal data. Specifically, we look at
robustness to disfluencies and propose a multitask LSTM-based model for domain-general disfluency detection. We then go on to explore robustness to anomalous, or out-of-domain (OOD)
input. We address this problem by (1) presenting Turn Dropout, a data-augmentation technique
that facilitates training for anomalous input using only in-domain data, and (2) introducing VHCN
and AE-HCN, autoencoder-augmented models for efficient training with turn dropout based on
the Hybrid Code Networks (HCN) model family.
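The core of Turn Dropout can be sketched in a few lines. This is only an illustration of the idea that no real OOD data is required; the exact replacement scheme and labelling in the thesis may differ.

```python
import random

def turn_dropout(dialog, corpus_turns, p=0.15, rng=None):
    """Turn Dropout (sketch): with probability `p`, replace a turn with
    a random utterance drawn from elsewhere in the in-domain corpus and
    label it anomalous (1); surviving turns keep label 0."""
    rng = rng or random.Random(0)
    out = []
    for turn in dialog:
        if rng.random() < p:
            # dropped turn: substitute a random in-domain utterance
            out.append((rng.choice(corpus_turns), 1))
        else:
            out.append((turn, 0))
    return out
```

Because the substituted utterances are contextually incoherent at the position where they appear, a model trained on these labels learns to flag anomalous input without ever seeing genuine OOD data.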
With all the above work addressing goal-oriented dialogue, our final contribution in this thesis
focuses on social dialogue where the main objective is maintaining natural, coherent, and engaging conversation for as long as possible. We introduce a neural model for response ranking
in social conversation used in Alana, the 3rd place winner in the Amazon Alexa Prize 2017 and
2018. For our model, we employ a novel technique of predicting the dialogue length as the main
objective for ranking. We show that this approach matches the performance of its counterpart
based on the conventional, human rating-based objective — and surpasses it given more raw
dialogue transcripts, thus reducing the dependence on costly and cumbersome dialogue annotations.

EPSRC project BABBLE (grant EP/M01553X/1)
Data-efficient goal-oriented conversation with dialogue knowledge transfer networks
Goal-oriented dialogue systems are now being widely adopted in industry where
it is of key importance to maintain a rapid prototyping cycle for new products
and domains. Data-driven dialogue system development has to be adapted to meet
this requirement --- therefore, reducing the amount of data and annotations
necessary for training such systems is a central research problem.
In this paper, we present the Dialogue Knowledge Transfer Network (DiKTNet),
a state-of-the-art approach to goal-oriented dialogue generation which only
uses a few example dialogues (i.e. few-shot learning), none of which has to be
annotated. We achieve this by performing a 2-stage training. Firstly, we
perform unsupervised dialogue representation pre-training on a large source of
goal-oriented dialogues in multiple domains, the MetaLWOz corpus. Secondly, at
the transfer stage, we train DiKTNet using this representation together with two
other textual knowledge sources with different levels of generality: the ELMo
encoder and the main dataset's source domains.
Our main dataset is the Stanford Multi-Domain dialogue corpus. We evaluate
our model on it in terms of BLEU and Entity F1 scores, and show that our
approach significantly and consistently improves upon a series of baseline
models as well as over the previous state-of-the-art dialogue generation model,
ZSDG. The improvement upon the latter --- up to 10% in Entity F1 and an
average of 3% in BLEU score --- is achieved using only the equivalent of 10% of
ZSDG's in-domain training data.

Comment: EMNLP 201
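Carving out a few-shot training budget like the one used in the comparison above can be done by simple subsampling. The uniform-over-dialogs sampling scheme below is our assumption, for illustration; the paper's data selection may differ.

```python
import random

def few_shot_subset(dialogs, fraction=0.1, seed=0):
    """Sample a few-shot training budget (sketch), e.g. the equivalent
    of 10% of a full in-domain training set, as whole dialogs drawn
    uniformly without replacement."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(dialogs)))
    return rng.sample(dialogs, k)
```

None of the sampled dialogs need annotations: DiKTNet's transfer stage consumes them as raw example dialogues.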
An Ensemble Model with Ranking for Social Dialogue
Open-domain social dialogue is one of the long-standing goals of Artificial
Intelligence. This year, the Amazon Alexa Prize challenge was announced for the
first time, where real customers get to rate systems developed by leading
universities worldwide. The aim of the challenge is to converse "coherently and
engagingly with humans on popular topics for 20 minutes". We describe our Alexa
Prize system (called 'Alana') consisting of an ensemble of bots, combining
rule-based and machine learning systems, and using a contextual ranking
mechanism to choose a system response. The ranker was trained on real user
feedback received during the competition, where we address the problem of
training on such noisy and sparse signals.

Comment: NIPS 2017 Workshop on Conversational A
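The ensemble-plus-ranker architecture reduces to a simple selection loop. The bot and ranker interfaces below are our assumptions for illustration, not the Alana codebase.

```python
def select_response(context, bots, ranker):
    """Ensemble response selection (sketch): every bot proposes a
    candidate, a contextual ranker scores each (context, candidate)
    pair, and the top-scoring response wins."""
    candidates = {name: bot(context) for name, bot in bots.items()}
    best = max(candidates, key=lambda name: ranker(context, candidates[name]))
    return best, candidates[best]
```

In the real system the bots mix rule-based and machine-learned responders, and the ranker is the contextual model trained on user feedback described above; here a toy ranker (e.g. one preferring longer responses) suffices to exercise the control flow.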